Transcriptomics of developing wild sunflower seeds from the extreme ends of a latitudinal gradient differing in seed oil composition

Abstract Seed oil composition, an important agronomic trait in cultivated sunflower, varies latitudinally across the native range of its wild progenitor. This pattern is thought to be driven by selection for a higher proportion of saturated fatty acids in southern populations compared with northern populations, likely due to the different temperatures experienced during seed germination. To investigate whether these differences in fatty acid composition between northern and southern populations correspond to transcriptional variation in the expression of genes involved in fatty acid metabolism, we sequenced RNA from developing seeds of sunflowers from Texas, USA, and Saskatchewan, Canada (the extreme ends of sunflower's latitudinal range) grown in a common garden. We found 4,741 genes to be differentially expressed between Texas and Canada, including several genes involved in lipid metabolism. Several differentially expressed lipid metabolism genes also colocalized with known oil quantitative trait loci (QTL). The genes producing stearoyl‐ACP‐desaturases (SAD) were of particular interest because of their known role in the conversion of fully saturated into unsaturated fatty acids. Two SAD genes were more highly expressed in seeds from Canadian populations, consistent with the observation of increased levels of unsaturated fatty acids in seeds from that region. We also constructed a gene co‐expression network to investigate regional variation in network modules. The results of this analysis revealed regional differentiation for eight of 12 modules but no clear relationship with oil biosynthesis. Overall, the differential expression of SAD genes offers a partial explanation for the observed differences in seed oil composition between Texas and Canada, while the expression patterns of other metabolic genes suggest complex regulation of fatty acid production and usage across latitudes.


| INTRODUCTION
Clinal patterns of phenotypic variation are common in nature and often track environmental gradients such as latitude or elevation (De Frenne et al., 2013;Halbritter et al., 2018). Such clines are often attributed to local adaptation, particularly for traits closely associated with life history, due to the importance of matching life history transitions with the local environment (De Frenne et al., 2013;Donohue et al., 2010;Fabian et al., 2015;Stinchcombe et al., 2004). Given the likely importance of traits that covary with environmental clines, there is particular interest in dissecting their genetic basis. Such information can provide insight into the processes that influence the distribution of adaptive genetic variants across species ranges and tolerance to stressful conditions, especially in the context of climate change (Ahuja et al., 2010;Atkins & Travis, 2010).
A particularly intriguing example of clinal trait variation in plants relates to seed oil composition (Linder, 2000;Sanyal et al., 2018).
Seed oils are composed of both saturated and unsaturated fatty acids (FAs), the relative proportions of which are thought to influence germination timing and the amount of energy available to seedlings across temperatures (Linder, 2000). While saturated FAs provide more energy than unsaturated FAs, the lower melting point of the unsaturated FAs increases energy availability at cooler temperatures. Notably, a negative relationship between latitude and the proportion of saturated FAs in seeds has been found for several taxa (Linder, 2000;Sanyal et al., 2018). Seed oils from tropical plant species near the equator tend to have significantly greater proportions of saturated FAs than temperate plants in higher latitudes (Linder, 2000;Manos & Stone, 2001). It has thus been suggested that variation in seed oil composition is an adaptation to the temperatures most often experienced during germination (Linder, 2000;Sanyal & Decocq, 2016;Sanyal & Linder, 2013).

One of the best-studied examples of clinal variation in seed oil
composition is in wild sunflower (Helianthus annuus L.). Across their native range, wild sunflower populations vary latitudinally in the proportion of saturated FAs in their seed oil; that is, southern populations tend to produce seeds with a greater proportion of saturated FAs than northern populations (Figure 1; Linder, 2000;McAssey et al., 2016).
Given that this pattern is maintained in a common garden, it seemingly results from genetic differentiation across the species range. Evidence for a trade-off in germination timing at high and low temperatures between populations from the extreme ends of the gradient has also been observed, consistent with the idea that natural selection on oil composition (due to its effects on germination and/or seedling establishment) is responsible for producing this pattern (Linder, 2000).
In addition to being a model for ecological studies, wild sunflower is the progenitor of cultivated sunflower (also H. annuus), which is one of the world's most important oilseed crops. Cultivated sunflower was initially domesticated over 4,000 years ago in what is now the eastcentral USA as a source of edible seeds and for cultural purposes (Heiser, 1978;Park & Burke, 2020;Rieseberg & Seiler, 1990) and was F I G U R E 1 The proportion of saturated fatty acids in seed oils of wild sunflower varies along a latitudinal gradient. (a) Populations of wild sunflowers included in the work of McAssey et al. (2016). Orange and blue dots represent the populations from Texas and Canada that were included in this study. Gray points represent populations studied by McAssey et al. (2016) but not included in this study. (b) Concentration of saturated fatty acids in seed oils from wild sunflower populations across a latitudinal gradient. Samples from Texas are colored orange and grouped within an orange box, while samples from Canada are colored blue and grouped within a blue box. The proportion of saturated fatty acids is significantly higher in seeds from Texas populations than from Canada populations as described in McAssey et al. (2016) and denoted by an asterisk. The seeds used for RNA extraction in this study were produced from the same crosses as those used for oil phenotyping in McAssey et al. (2016). later subjected to intense selection for increased seed oil concentration, as well as a shift in FA composition towards greater unsaturated FA content (Fick & Miller, 1997). Sunflower oil is a naturally rich source of oleic and linoleic acids (18:1 and 18:2 respectively, where the numbers represent the length of the carbon chain and the number of double bonds present [i.e., the degree of FA unsaturation]). These two FAs combine to account for approximately 85%-90% of the total seed oil content, with the remainder being largely made up of the fully saturated palmitic (16:0) and stearic (18:0) acids. In recent years, breeding efforts have increasingly focused on the development of high-oleic lines that produce seed oil composed of ≥ 80% oleic acid, which is valued for its health benefits and stability during storage (Miller et al., 1987).
Due to the ecological and economic importance of seed oil composition and the general importance of FAs in biological systems, the FA metabolic pathway has been well-characterized ( Figure 2; Bates et al., 2013;Jung et al., 2019;Ohlrogge et al., 1991). In plants, palmitic (16:0) and stearic (18:0) acid are the first long-chain FAs produced during FA synthesis. Stearoyl-ACP desaturases (SAD) then convert palmitic and stearic acid into monounsaturated palmitoleic (16:1) and oleic (18:1) acid, though the conversion of palmitic acid to palmitoleic acid is relatively minor compared with the conversion of stearic acid to oleic acid. Fatty acid desaturases (FAD) then convert oleic acid into linoleic acid and other polyunsaturated FAs. The role of the enzymeproducing SAD and FAD genes in controlling oil composition has been well established in many plant species (Belide et al., 2012;Dar et al., 2017;Fofana et al., 2006;Liu et al., 2002;Rajwade et al., 2014;Thambugala & Cloutier, 2014). In cultivated sunflower, the regulation of these genes is responsible for many commercially important seed oil phenotypes, including the aforementioned high-oleic acid phenotype, which is conditioned by a mutation resulting in the downregulation of the seed-specific FAD2-1 gene (Hongtrakul et al., 1998; F I G U R E 2 Simplified overview of the plant fatty acid biosynthetic pathway. The first step of fatty acid (FA) biosynthesis in the plastid is the conversion of Acetyl-coenzyme A (CoA) to malonyl-CoA by acteyl-CoA carboxylase (ACCase) which is then further converted to malonyl acylcarrier protein (ACP) by malonyl coenzyme A:acyl carrier protein transacylase (MCAMT). From there, fatty acid synthase (FASII complex, a multienzyme complex made up of several fatty acid biosynthesis genes) catalyzes several elongation steps to produce 16:0-ACP. Ketoacyl-synthase II (KASII, often considered a component of the FAS complex) further elongates 16:0-ACP into 18:0-ACP. SAD then removes the first double bond from these molecules to form a 16:1-ACP and 18:1-ACP (SAD exhibits a strong substrate preference for 18:0-ACP over 16:0-ACP). FatB and FatA remove the acyl-carrier protein to form palmitic, stearic, and oleic acid. Long-chain acyl-CoA synthetase (LACS) converts free FAs into acyl-CoAs which are then transferred to the endoplasmic reticulum where lysophosphatidylcholine acyltransferase (LPCAT) removes the CoA and replaces it with phosphatidylcholine (PC). FAD2 then desaturates 18:1-PC into 18:2-PC before FAD3 further desaturates that into 18:3-PC. Those polyunsaturated fatty acid (PUFA)-PC molecules are then converted into PUFA-CoA by the combined activity of phospholipase A2 (PLA 2 ) and LACS4/9 before finally being converted into linoleic and alpha-linoleic acid. PUFA-PC, PUFA-CoA, and free PUFAs in the ER can all be used to produce other lipid-based molecules. Oleic acid can also be converted into linoleic and alpha-linoleic acid in the plastid through the activity of FAD6 and FAD7/8 respectively, but this is not the primary pathway for FA desaturation (Bates et al., 2013;Jung et al., 2019;Ohlrogge et al., 1991Ohlrogge et al., ). et al., 1987Schuppert et al., 2006). Similarly, increased stearic acid content has been mapped to quantitative trait loci (QTL) containing SAD genes (Pérez-Vich et al., 2004. While regulation of SAD and FAD genes underlie these commercially important phenotypes in cultivated sunflower, less is known about how these and related genes contribute to observed variation in seed oil composition in wild populations.  Figure 1). Overall, several genes related to oil metabolism are differentially expressed across the range of wild sunflower. We found that SAD genes specifically are expressed at a significantly higher level in developing seeds from Canadian populations when compared with those from Texan populations, consistent with our expectations given the observed higher levels of desaturation at more northern latitudes. Additionally, many of the DEGs related to oil metabolism were found within oil QTL. Contrary to expectations, however, DEGs related to oil metabolism did not group into a single co-expression network module associated with region, which suggests that oil metabolism genes do not act in a coordinated fashion to produce the observed seed oil phenotypes. Taken together, our results suggest that the regulation of seed oil metabolism across latitudes is a complex process that is influenced by many different genes.

| Plant Materials
Seeds used in this study were produced via intrapopulation crosses in a common greenhouse environment during a previous experiment (McAssey et al., 2016). Because both RNA extraction and phenotyping for oil composition are destructive, we were unable to obtain both phenotypic and transcriptomic data from the exact same seed samples. Given this necessary limitation and to make our data as comparable as possible, we extracted RNA from seeds produced by the same crosses and collected at the same Briefly, seeds from six populations representing the southern and northern ends of the range of wild sunflower (three from Texas, USA, and three from Saskatchewan, Canada; Figure 1 and Table 1) were germinated, allowed to establish, and then transplanted into pots in the greenhouse where they were arranged in a randomized fashion and grown to flowering. Four biological replicates per population were grown and sampled; however, not all samples passed quality control after RNA extraction, as detailed below. Flower heads were bagged once buds began to develop to prevent unintended cross-pollination.
Because wild sunflowers are self-incompatible, pairs of individuals originating from the same population were manually cross-pollinated.
This involved removing the pollination bags from the two focal plants, collecting pollen from both plants directly into a petri dish, and then using a paintbrush to apply this pollen to receptive florets of the same two plants. Because florets within an inflorescence open over time, pollination was performed over a period of days for each individual.
Following pollination, heads were rebagged to prevent contamination and plants were allowed to set seed. Fifteen days after the final pollination, eight developing achenes (i.e., single-seeded fruits) were collected from the center of each head and pooled together as a single sample for each biological replicate to ensure adequate tissue quantity for RNA extractions. The eight achenes were then placed into 1.5 ml tubes, and frozen in liquid nitrogen. These tubes were then stored at À80 C until RNA extraction.
T A B L E 1 Accession numbers, location data, and number of maternal plants contributed seed (biological replicates) that passed RNAseq quality thresholds for use in this study

| RNA extraction and library construction
The eight frozen achenes from each biological replicate were ground with a mortar and pestle using liquid nitrogen and approximately 1 g of polyvinylpolypyrrolidone (PVPP). The ground tissue was then transferred into a tube and placed in liquid nitrogen to keep it frozen. After removing individual samples from the liquid nitrogen for processing, 1 ml of Trizol was added to each tube. The contents were then mixed and allowed to incubate at room temperature for 5 min. Chloroform (300 μl) was then added to each sample, and the tubes were manually shaken and then centrifuged at 12,000 Â G for 10 min. The resulting RNAseq data were processed using a custom bioinformatics pipeline (https://github.com/EDitt/Sunflower_RNAseq).
Adapter sequences were first trimmed using Trimmomatic v0.36 with default settings (Bolger et al., 2014). Next, because samples were split across four lanes during sequencing, reads from each lane were separately mapped to the XRQv1 sunflower genome assembly (Badouin et al., 2017) using the two-pass mapping method implemented in STAR v2.6.1c (Dobin et al., 2013). The program RSEM v1.3.1 (Li & Dewey, 2011) was then used to calculate expression levels for each gene in each sample prior to downstream analysis.
Only genes that were expressed at a threshold of at least one countper-million in two or more samples per region were retained for this analysis. To correct for the difference in library sizes between samples, raw counts were normalized to the trimmed mean of M-values as recommended by the edgeR user guide. Differential expression analyses are not affected by gene-length bias as the analyses are only concerned with the relative abundance of individual transcripts between samples, so no gene-length correction was used during this part of the analysis. The model matrix used to estimate dispersions incorporated source population and region as factors. In addition, we tested for differential expression between population pairs within Texas and Canada, except for TEX2, which only had one sample. The false discovery rate was controlled for by applying a Benjamini-Hochberg correction (Benjamini & Hochberg, 1995) to the set of P-values calculated by edgeR. Due to prior knowledge of the influence of SAD and FAD genes on FA saturation, we conducted independent tests of differential expression of these genes by comparing the transcripts-per-million for each gene between the Texan and Canadian samples using a Wilcoxon signed-rank test (Wilcoxon, 1945).
Gene ontology (GO) terms for each of the sunflower genes were obtained from the blast2go (Conesa & Götz, 2008) (Wallenius, 1963) after which the false discovery rate was controlled for by applying a Benjamini-Hochberg correction to the raw P-values (Benjamini & Hochberg, 1995).

| QTL colocalization
We determined the locations of DEGs relative to the positions of previously mapped QTL related to FA biosynthesis that were compiled in

| Co-expression network analysis
We built a co-expression network using CemiTool v3.12 (Russo et al., 2018) in R to test for associations between geographic region (i.e., Texas vs. Canada) and individual co-expression modules. We generated a signed network using Pearson correlation coefficients based on transcripts-per-million normalized counts for all genes that passed the minimum expression threshold for the differential expression analysis. Invariant genes were filtered out from module construction using default settings, and modules were then tested for their association with region using gene-set enrichment analysis as implemented by CemiTool using the package FGSEA (Korotkevich et al., 2021). Hub genes were identified as the top five most connected genes within each module. Co-expression modules were then tested for enrichment of GO terms using CemiTool's overrepresentation analysis.

| Identification of lipid metabolism genes
To determine which genes in the sunflower genome are involved in oil biosynthesis and overall lipid metabolism, we used a FASTA file containing all genes in the sunflower genome (Badouin et al., 2017) as an input for the Mercator4 tool. This tool is included in the MapMan4 software suite and uses a BLAST based approach to compare gene sequences to those of a curated set of reference proteins (Schwacke et al., 2019). This curated set of proteins is organized into several pathways; genes mapping to the lipid metabolism pathway were identified and then cross-referenced with the results of our differential expression analysis.  Figure S1). Just over half ( As these individuals were grown in a common garden, variation in gene expression is likely due to genetic differences between

F I G U R E 3 (a) Multidimensional scaling plot illustrating variability between samples/locations. (b) Barplot showing the number of differentially expressed genes between samples from Texas (TEX) and
Canada (CAN) as well as the number of differentially expressed genes between individual populations within those regions. Each bar represents the number of upregulated genes in that population relative to the other population in the comparison (i.e., in the CANvTEX comparison, there are 2,897 genes upregulated in CAN relative to TEX and 1,844 genes upregulated in TEX relative to CAN).
populations. The clear differences in gene expression between populations from Texas and Canada were expected based on evidence that wild sunflower exhibits north-south divergence in terms of both genetic and phenotypic variation (Blackman et al., 2011;McAssey et al., 2016;Park & Burke, 2020). These patterns likely result, at least in part, from selection for adaptation to the drastic environmental differences between Texas and Canada; such selection might also have contributed to variation in the expression of genes related to environmental adaptation (Akman et al., 2016;Lasky et al., 2014). The CAN1 population is geographically more distant from CAN2 and CAN3 than CAN2 and CAN3 are from each other (Figure 1a) which may result in increased genetic differentiation between CAN1 and the other CAN populations, thereby resulting in a greater number of DEGs.

| GO term enrichment within DEGs
Among the 4,741 genes that were differentially expressed in samples from Texas versus Canada, 16 GO terms were found to be significantly enriched (supporting information Data Set S3 and Table 2).
Most of the enriched GO terms fell into the molecular function ontology category, with fewer falling into the biological process category, and only a single term being a part of the cellular component category.
The enrichment of the single cellular component term, membrane (GO:0016020), could be related to differences in seed oil composition, as the FA composition of membranes is critical for maintaining membrane fluidity at different temperatures (Falcone et al., 2004); however, this is a very high-level GO term and its relationship to the FA profiles of the seed samples in this analysis is merely speculative. GO terms directly related to FA production, such as lipid metabolism (GO:0006629), omega-6 FAD activity (GO:0045485), stearoyl-ACPdesaturase activity (GO:0102786), and several others, were not found to be enriched within this set of DEGs, even prior to the multiple test correction. We also tested for GO term enrichment among genes upregulated in each region specifically and for each population comparison, but most of these enrichment tests did not yield noteworthy results (supporting information Data Set S3). The lone exception was among genes upregulated in CAN1 versus CAN3 where there was enrichment for genes related to the fatty acid biosynthesis GO term (GO: 0006633). This was somewhat surprising given that the enrichment of this GO term is based upon a comparison between two CAN populations and suggests that overall differences in the expression of genes related to FA biosynthesis between TEX and CAN populations might be driven primarily by the expression of FA biosynthesis genes in CAN1 samples.

|
Are genes related to oil metabolism, particularly members of the SAD and FAD gene families, differentially expressed between regions?

| Expression patterns of SAD and FAD genes
Given their known effects on FA saturation, the expression patterns of SAD and FAD genes were examined in greater detail to determine if they showed evidence of differential regulation consistent with observed phenotypic differences between regions. Two SAD genes and 37 FAD genes were expressed in our developing seed samples; T A B L E 2 GO term enrichment among genes differentially expressed between Texas and Canada however, one FAD gene on chromosome 14 was expressed at substantially higher levels than the other 36 FAD genes and likely contributes disproportionately to overall FAD enzyme activity.
We conducted independent tests of differential expression on SAD and FAD genes based upon our prior knowledge of the role of these genes in controlling the conversion of saturated FAs into monounsaturated FAs and, subsequently, into polyunsaturated FAs ( Figure 2; SAD and FAD genes respectively). Using a Wilcoxon rank sum test to compare differences in the TPM of the two SAD genes and most highly expressed FAD gene between TEX and CAN, we found that there were significant differences in expression for all three genes between regions (Figure 4). Both SAD genes were expressed at significantly higher level in CAN versus TEX, while the most highly expressed FAD gene was expressed at a significantly higher level in TEX.
Based on the functional role of SAD genes reported in other studies (e.g., Cantisán et al., 2000;Pérez-vich et al., 2002;Pleite et al., 2006;Salas et al., 2004), the increased SAD gene expression in developing seeds from Canada compared with Texas is consistent with the observed phenotypic patterns showing higher unsaturated FA content in seeds from this portion of the range (McAssey et al., 2016). Because the transition from fully saturated FAs to monounsaturated FAs is solely mediated by the SAD enzyme (Figure 2), it is thus reasonable to expect that increased expression of SAD genes would potentially increase the total number of SAD enzymes produced and therefore convert more saturated FAs into monounsaturated FAs through increased enzymatic activity. While our data cannot definitively show that increased SAD gene expression in Canadian sunflowers is the cause of higher seed unsaturated FA content, our results provide strong evidence for future investigation.

| Differential expression analysis of other genes in the FA biosynthesis pathway
While the SAD and FAD genes are primarily responsible for modulating the degree of FA (de)saturation in seed oils (Bates et al., 2013), we also examined the expression of genes within the FA biosynthesis and overall lipid metabolism pathways more broadly. Using the Mercator4 tool, we found 850 genes that mapped to the lipid metabolism pathway curated as part of the MapMan4 software suite. Of those 850 genes, 571 of them passed the minimum expression threshold to be included in our tests for differential expression. Of those 571 expressed genes, 55 of them were differentially expressed between Texas and Canada (supporting information Data Set S4).
Aside from several differentially expressed FAD genes, few of these DEGs are directly involved in the FA biosynthesis pathway outlined in  (Gu et al., 2020;Millar & Kunst, 1997). Drought stress is known to trigger production of these very-long-chain FAs, and while they do not affect the overall degree of seed oil saturation, expression of these genes can alter the levels of individual FAs within seed oils (Gu et al., 2020;Guo et al., 2020;Hashim et al., 1993). Members of the ALA-ALIS flippase complex are also differentially expressed and are involved in the translocation of phospholipids between membrane bilayers (Nintemann et al., 2019).
Overall, the fact that several lipid metabolism genes are differentially expressed between regions suggests that there is complex regulation and usage of FAs that go beyond a simple shift in saturated:unsaturated FA ratios and that might be involved in adaptation to the specific environmental conditions experienced in Texas versus Canada.
3.3 | Are DEGs related to oil metabolism found within oil QTL?

| DEG colocalization with oil QTL
We also asked if DEGs were enriched within known sunflower oil QTL from previous mapping studies. We determined the genomic location of DEGs relative to 51 oil QTL from previously published mapping studies in cultivated sunflower and eight oil QTL from a wild Â cultivated sunflower cross (Badouin et al., 2017;Burke et al., 2005;Ebrahimi et al., 2008;Pérez-Vich et al., 2016;Premnath et al., 2016). There were 5,145 expressed genes located within oil QTL, 759 of which were DEGs (supporting information Data Set S5).
Among all oil QTL, DEGs were significantly under-enriched as F I G U R E 4 Comparison of expression levels of the two SAD genes and the most highly expressed FAD gene (FAD2-1) among individuals from Texas and Canada. Gene expression is scaled to transcripts-per-million. P-values are calculated from a Wilcoxon signed-rank test. Error bars represent one standard error from the mean. compared with random expectation (À1.08-fold enrichment, 828.97 expected DEGs; P-value = .006), while within any individual oil QTL, DEGs were not significantly enriched.
We also asked whether any of the lipid metabolism genes colocalized with any oil QTL. Ten lipid metabolism genes were found within oil QTL; three genes colocalized with the QTL for oleic acid content on chromosome 1 from the wild Â cultivated cross, two genes colocalized with the QTL for oleic acid content on chromosome 3 from the wild Â cultivated cross, and the remaining five genes individually colocalized with five additional QTL (Table 3). Four of these 10 genes were found to be differentially expressed. One differentially expressed FAD gene is found within the oleic acid QTL on chromosome 1 (HanXRQChr01g0011241), and the other differentially expressed FAD gene is the most highly expressed FAD gene in our data set discussed earlier (HanXRQChr14g0452931). One differentially expressed KCS gene is also found within a palmitic acid content QTL on chromosome 8. As noted earlier, KCS genes contribute to the production and elongation of very-long-chain FAs and are involved in abiotic stress tolerance (Gu et al., 2020;Millar & Kunst, 1997). The final differentially expressed lipid metabolism gene that colocalizes with an oil QTL is a glycerol-3-phosphate acyltransferase (ATS1) found on chromosome 6 (HanXRQChr06g0172791); this gene corresponds to the rate-limiting enzyme producing lysophosphatidic acid (an important intermediate in the formation of acyl-lipids) in plants (Chen et al., 2011). These results are consistent with the hypothesis that genes affecting seed oil characteristics vary in their expression within cultivated sunflower and may influence seed oil characteristics in wild sunflower as well.
3.4 | Are DEGs related to oil metabolism found within co-expression network modules associated with geographic regions?

| Gene co-expression network analysis
To further examine how gene expression varies between Canada and Texas, we built a gene co-expression network and tested for modulespecific associations with geographic regions. The variance filtering step reduced the number of genes used for module construction to 3,115. These genes clustered into 12 co-expression modules, eight of which exhibited significant differences in expression between regions (  Badouin et al. (2017). QTL from the wild Â cultivated cross were identified by Burke et al. (2005). Genes in bold were found to be differentially expressed in our analyses. Abbreviation: QTL, quantitative trait loci.
The five genes with the highest connectivity within each module were identified as "hub" genes (supporting information Data Set S6).
Many of these hub genes were of unknown function while very few were related to lipid metabolism. Module M7 included two GDSL esterase lipase genes and a non-specific lipid transfer gene as hub genes, while no other module had any hub genes that were seemingly related to lipid metabolism. We then looked for differentially expressed oil metabolism genes within each module, similar to our analysis of oil metabolism genes within QTL. Of the 55 oil-related DEGs, 11 were found within co-expression modules (Table 5). In addition, one of the SAD genes was also found in a co-expression module. Only the FAD gene HanXRQChr01g0011241 was found both within a co-expression module and an oil QTL, but this FAD gene was lowly expressed among all our samples (between 0.5 and 39.1 TPM, average 14.8 TPM).
While genes related to oil metabolism were found within coexpression modules, we did not find large groups of oil metabolism genes that were co-expressed and also varied between regions. This suggests that the variation in seed oil composition between regions is not the result of coordinated gene regulation within a co-expression module. Instead, variation in seed oil composition is likely the result of several independent processes that do not necessarily act in a coordinated manner.

| CONCLUSIONS
Taken together, our results suggest that the observed differences in seed oil composition between regions likely result from the complex regulation of numerous genes involved in FA metabolism. Evidence suggests that the higher expression of SAD genes in Canadian sunflower populations compared with Texan sunflower populations could play a role in altering seed oil composition and contribute to observed patterns of latitudinal variation in oil composition. Furthermore, several other genes related to lipid metabolism are differentially expressed across the range and can be found within oil QTL. While these genes could influence the FA composition of seeds, they could also be involved in other important biological processes related to lipid metabolism. Overall, our results provide insights into the possible role of transcriptomic variation of many genes involved in oil metabolism, and specifically SAD and FAD genes, in shaping variation in seed oil composition across the range of wild sunflower while also providing avenues for future studies of this ecologically and agronomically important trait.
T A B L E 4 Module information from the construction of a gene co-expression network